Clustering Crowds
Authors
Abstract
We present a clustered personal classifier method (CPC method) that jointly estimates a classifier and clusters of workers to address the learning from crowds problem. Crowdsourcing allows us to create a large but low-quality data set at very low cost, and the learning from crowds problem is to learn a classifier from such a low-quality data set. We observe that workers form clusters according to their abilities. Although this fact has been pointed out several times, no method has applied it to the learning from crowds problem. The proposed CPC method uses the worker clusters to improve the performance of the obtained classifier, estimating both the classifier and the clusters. It has two advantages. First, it yields a robust estimate of the classifier because it exploits the prior knowledge that workers tend to form clusters. Second, it produces the worker clusters themselves, which help us analyze the properties of the workers. Experimental results on synthetic and real data sets indicate that the proposed method estimates the classifier robustly and that the worker clustering works well. In particular, applying the proposed method to the real data set revealed an outlier worker.
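The idea that workers cluster by ability can be illustrated with a small sketch. This is not the CPC method itself, which estimates the classifier and the clusters jointly; it is a simplified two-stage approximation that first fits one personal classifier per worker and then clusters the resulting weight vectors. Everything here is an assumption made for illustration: the synthetic data, the flip rates, and the hypothetical helpers `fit_logistic` and `two_means`.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, steps=300, l2=1e-2):
    """Fit an L2-regularised logistic regression by plain gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
    return w

def two_means(W, iters=20):
    """Tiny 2-means over worker weight vectors, seeded with the two most distant rows."""
    d = np.linalg.norm(W[:, None] - W[None, :], axis=2)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    centers = W[[i, j]].copy()
    for _ in range(iters):
        dist = np.linalg.norm(W[:, None] - centers[None], axis=2)
        labels = np.argmin(dist, axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = W[labels == k].mean(axis=0)
    return labels, centers

# Synthetic crowd (assumed setup): 400 items, a linear ground truth,
# four fairly reliable workers and two workers who mostly invert labels.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(400, 2))
y_true = (X @ w_true > 0).astype(float)
flip_rates = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9]

# Stage 1: one personal classifier per worker, trained on that worker's noisy labels.
W = []
for rate in flip_rates:
    flips = rng.random(len(y_true)) < rate
    y_worker = np.where(flips, 1.0 - y_true, y_true)
    W.append(fit_logistic(X, y_worker))
W = np.array(W)

# Stage 2: cluster the personal classifiers and take the majority cluster's
# centre as the estimated classifier.
labels, centers = two_means(W)
majority = np.argmax(np.bincount(labels, minlength=2))
w_hat = centers[majority]
accuracy = np.mean((X @ w_hat > 0) == (y_true > 0.5))

print("cluster label per worker:", labels)
print("accuracy of majority-cluster classifier on the items:", accuracy)
```

In this toy setup the two label-inverting workers end up in their own cluster, which is the kind of structure the abstract says helps in analyzing workers. Taking the majority cluster's centre is a design choice of this sketch only and assumes that most workers are reliable.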
Similar resources
Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge
The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory has been used in clustering problems. Previous research showed that this theory can significantly increase the stability and performance of...
Behavior-based Clustering for Discrimination between Flash Crowds and DDoS Attacks
We propose discrimination methods that classify clusters of traffic behavior of flash crowds and DDoS attacks, such as traffic patterns and characteristics, and check cluster randomness. The behavior-based clustering consolidates packets into clusters based on similarity of observed behavior, e.g., source IPs are clustered together based on their pattern of destination port usage. The main objectiv...
Vote Aggregation as a Clustering Problem
An important way to build large training sets is to gather noisy labels from crowds of non-experts. We propose a method to aggregate noisy labels collected from a crowd of workers or annotators. Eliciting labels is important in tasks such as judging web search quality and rating products. Our method assumes that labels are generated by a probability distribution over items and labels. We formula...
İnsan Kalabalıklarının Baskın Kümeler Tabanlı Analizi (Dominant Sets Based Analysis of Human Crowds)
Due to recent advances in new camera technologies and the Internet, millions of videos can be easily accessed from any place at any time. A significant amount of these videos are for surveillance, and include actors such as humans and vehicles performing different actions in dynamic scenes. The goal of this study is to analyze human crowd motions in videos. More specifically, moving humans are ...